A statistical approach for inferring the three-dimensional structure of the genome
Authors
Abstract
Motivation: Recent technological advances allow a single Hi-C experiment to measure the frequencies of physical contacts among pairs of genomic loci at a genome-wide scale. The next challenge is to infer, from the resulting DNA-DNA contact maps, accurate three-dimensional models of how chromosomes fold and fit into the nucleus. Many existing inference methods rely on multidimensional scaling (MDS), in which the pairwise distances of the inferred model are optimized to resemble pairwise distances derived directly from the contact counts. These approaches, however, often optimize a heuristic objective function and require strong assumptions about the biophysics of DNA to transform interaction frequencies into spatial distances, thereby leading to incorrect structure reconstruction.
Methods: We propose a novel approach to infer a consensus three-dimensional structure of a genome from Hi-C data. The method incorporates a statistical model of the contact counts, assuming that the counts between two loci follow a Poisson distribution whose intensity decreases with the physical distance between the loci. The method can automatically adjust the transfer function relating the spatial distance to the Poisson intensity and infer a genome structure that best explains the observed data.
Results: We compare two variants of our Poisson method, with or without optimization of the transfer function, to four different MDS-based algorithms (two metric MDS methods using different stress functions, a nonmetric version of MDS, and ChromSDE, a recently described, advanced MDS method) on a wide range of simulated datasets. We demonstrate that the Poisson models reconstruct better structures than all MDS-based methods, particularly at low coverage and high resolution, and we highlight the importance of optimizing the transfer function.
On publicly available Hi-C data from mouse embryonic stem cells, we show that the Poisson methods lead to more reproducible structures than MDS-based methods when we use data generated using different restriction enzymes, and when we reconstruct structures at different resolutions.
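The Poisson model described in the abstract can be illustrated with a minimal sketch. The abstract only states that the Poisson intensity decreases with distance; the power-law transfer function `beta * d**alpha` (with `alpha < 0`), the default parameter values, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distances between all rows of X (n loci x 3 coordinates)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def poisson_neg_log_likelihood(X, counts, alpha=-3.0, beta=1.0):
    """Negative Poisson log-likelihood of contact counts given a structure X.

    Assumed transfer function (hypothetical): intensity = beta * d**alpha,
    with alpha < 0 so that intensity decreases with distance.
    The constant log(c!) term is dropped because it does not depend on X.
    """
    iu = np.triu_indices(X.shape[0], k=1)  # each pair of loci once, no diagonal
    d = pairwise_distances(X)[iu]
    c = counts[iu]
    lam = beta * d ** alpha
    return float(np.sum(lam - c * np.log(lam)))


def optimal_beta(X, counts, alpha=-3.0):
    """Closed-form maximizer of the likelihood in beta for fixed X and alpha.

    Setting the derivative with respect to beta to zero gives
    beta = sum(c) / sum(d**alpha).
    """
    iu = np.triu_indices(X.shape[0], k=1)
    d = pairwise_distances(X)[iu]
    return float(counts[iu].sum() / (d ** alpha).sum())
```

In a full method the coordinates `X` (and, when the transfer function is optimized, `alpha` and `beta`) would be fitted by minimizing this objective with a numerical optimizer; the sketch only evaluates the objective for a given structure.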
Similar resources
Bayesian approach to inference of population structure
Methods of inferring population structure, and their applications in identifying disease models and in forecasting the physical and mental condition of human beings, have become increasingly important. In this article, the motivation for and significance of studying the problem of population structure are explained first. In the next section, the applications of inference of p...
Three-dimensional quantitative structure activity relationship approach series of 3-Bromo-4-(1-H-3-Indolyl)-2, 5-Dihydro-1H-2, 5- Pyrroledione as antibacterial agents
The use of quantitative structure–activity relationships, since its advent, has become increasingly helpful in understanding many aspects of biochemical interactions in drug research. This approach was utilized to explain the relationship of structure with biological activity of antibacterial. For the development of new fungicides against, the quantitative structural–activity relationship (QSAR) an...
Three-dimensional elasticity solution for vibrational analysis of thick continuously graded sandwich plates with different boundary conditions using a two-parameter micromechanical model for agglomeration
An equivalent continuum model based on the Eshelby-Mori-Tanaka approach was employed to estimate the effective constitutive law for an elastic isotropic medium (i.e., the matrix) with oriented straight carbon nanotubes (CNTs). The two-dimensional generalized differential quadrature method was an efficient and accurate numerical tool for discretizing equations of motion and for implementing vari...
Thermoelastic Interaction in a Three-Dimensional Layered Sandwich Structure
The present article investigates the thermoelastic interaction in a three-dimensional homogeneous and isotropic sandwich structure using the dual-phase-lag (DPL) model of generalized thermoelasticity. The incorporated resulting non-dimensional coupled equations are applied to a specific problem in which a sandwich layer of unidentical homogeneous and isotropic substances is subjected to time-de...
Methodology for Inferring Moral Priorities According to the Narrations of "Afal Tafzil"
Considering the different levels of moral values in Islam, in order to know the most important values and also to eliminate the contradiction, it is necessary to deduce from the texts of verses and hadiths. One of the most important aspects in these texts is the "structure of Tafzil". Some narrations of this structure indicate the priority of one or more values and others indicate a rule in det...
Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes
The growing amount of information on biological sequences has made the application of statistical approaches necessary for modeling and estimating their functions. In this paper, the sensitivity and specificity of the first and second Markov chains for gene prediction were evaluated using complete double-stranded DNA virus genomes. There were two approaches for prediction of each Markov Model parameter,...
Journal title:
Volume, issue
Pages -
Publication date: 2017